Estimation of Entropy from Subword Complexity

Author

  • Lukasz Debowski
Abstract

Subword complexity is a function that describes how many different substrings of a given length are contained in a given string. In this paper, two estimators of block entropy are proposed, based on the profile of subword complexity. The first estimator works well only for IID processes with uniform probabilities. The second estimator provides a lower bound of block entropy for any strictly stationary process with the distributions of blocks skewed towards less probable values. Using this estimator, some estimates of block entropy for natural language are obtained, confirming earlier hypotheses.
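To make the definition of subword complexity concrete, here is a minimal Python sketch that counts the distinct substrings of each length in a string. The function names `subword_complexity` and `complexity_profile` are illustrative assumptions, not taken from the paper, and the sketch does not implement either of the paper's entropy estimators.

```python
def subword_complexity(text: str, k: int) -> int:
    """Count the distinct substrings of length k contained in text."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if k > len(text):
        return 0  # no substring of that length exists
    # Slide a window of width k over the text and collect unique blocks.
    return len({text[i:i + k] for i in range(len(text) - k + 1)})


def complexity_profile(text: str, max_k: int) -> list[int]:
    """Subword complexity for every block length 1..max_k (hypothetical helper)."""
    return [subword_complexity(text, k) for k in range(1, max_k + 1)]


if __name__ == "__main__":
    # Profile of a short sample string for block lengths 1, 2, 3.
    print(complexity_profile("abracadabra", 3))
```

This naive version stores every block in a set, so it is suited only to short strings and small block lengths; a suffix-based data structure would be a more economical choice for large texts.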


Similar resources

Subword Complexity of Profinite Words and Subgroups of Free Profinite Semigroups

We study free profinite subgroups of free profinite semigroups of the same rank using, as main tools, iterated implicit operators, subword complexity and the associated entropy.


A New Approach to Detect Congestive Heart Failure Using Symbolic Dynamics Analysis of Electrocardiogram Signal

The aim of this study is to show that the measures derived from Electrocardiogram (ECG) signals often perform better than the same measures obtained from heart rate (HR) signals. A comparison was made to investigate how far the nonlinear symbolic dynamics approach helps to characterize the nonlinear properties of ECG signals and HR signals, and thereby discriminate between normal and cong...


Discharge Estimation by using Tsallis Entropy Concept

Flow-rate measurement in rivers under different conditions is required for river management purposes including water resources planning, pollution prevention, and flood control. This study proposed a new discharge estimation method by using a mean velocity derived from a 2D velocity distribution formula based on Tsallis entropy concept. This procedure is done based on several factors which refl...


Subword complexity and power avoidance

We begin a systematic study of the relations between subword complexity of infinite words and their power avoidance. Among other things, we show that – the Thue-Morse word has the minimum possible subword complexity over all overlap-free binary words and all (7/3)-power-free binary words, but not over all (7/3)⁺-power-free binary words; – the twisted Thue-Morse word has the maximum possible sub...



Publication date: 2016